Empirical Policy Evaluation With Supergraphs

Abstract

We devise algorithms for the policy evaluation problem in reinforcement learning, assuming access to a simulator and certain side information called the supergraph. Our algorithms explore backward from high-cost states to find high-value ones, in contrast to forward approaches that work forward from all states. While several papers have demonstrated the utility of backward exploration empirically, we conduct rigorous analyses which show that our algorithms can reduce average-case sample complexity from O(S log S) to as low as O(log S). Analytically, we adapt tools from the network science literature to provide a new methodology for reinforcement learning problems.
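
As a rough illustration of the backward-exploration idea, the sketch below runs a breadth-first search backward along supergraph edges, starting from a set of high-cost states. It is not the paper's algorithm: the function name `backward_distances`, the predecessor map `preds`, and the `max_depth` cutoff are all hypothetical, and the supergraph is assumed to be given as a map from each state to the states that may transition into it. Only states that can reach a high-cost state within `max_depth` steps are visited, so the search never touches the rest of the state space.

```python
from collections import deque


def backward_distances(preds, high_cost_states, max_depth):
    """Breadth-first search backward along (assumed) supergraph edges.

    preds: hypothetical dict mapping a state to the states that may
           transition into it (a superset of the true transitions).
    Returns a dict: state -> fewest steps needed to reach a high-cost state,
    restricted to states within max_depth steps.
    """
    dist = {s: 0 for s in high_cost_states}
    queue = deque(high_cost_states)
    while queue:
        s = queue.popleft()
        if dist[s] == max_depth:
            continue  # do not expand beyond the depth cutoff
        for p in preds.get(s, ()):
            if p not in dist:  # first visit gives the shortest distance
                dist[p] = dist[s] + 1
                queue.append(p)
    return dist


# Toy chain 0 -> 1 -> 2 -> 3, where state 3 is the only high-cost state.
preds = {3: [2], 2: [1], 1: [0]}
dist = backward_distances(preds, [3], max_depth=2)
```

In this toy chain, state 0 lies three steps from the high-cost state and is therefore never visited with `max_depth=2`. With a discount factor, a state at backward distance k can incur the high cost only after k steps, so its contribution is scaled by the discount raised to the power k, which is one intuition for why distant states may be skipped.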

Similar resources

Fault tolerant supergraphs with automorphisms

Given a basic graph Y and a desired level of fault-tolerance k, an objective in fault-tolerant system design is to construct a supergraph X such that the removal of any k nodes from X leaves a graph containing Y. In order to reconfigure around faults when they occur, it is also required that any two subsets of k nodes of X are in the same orbit of the action of its automorphism group. In this ...

Policy Evaluation and Empirical Growth Research

This paper explores the implications of the vast body of studies of cross-country growth determinants for the evaluation of alternative policies. Empirical growth studies have experienced a remarkable flowering in the last fifteen years, and innumerable insights have unquestionably been uncovered concerning similarities and differences in the growth experiences of various groups of countries. T...

Goal directed policy conflict detection and prioritisation: an empirical evaluation

We address the problem of developing effective automated reasoning support for the detection and resolution of conflicts between plans and policies (or norms). How automated reasoning mechanisms can effectively support human decision makers in this process is little understood. In this research, we have conducted experiments with human subjects to assess how effective these reasoning mechanisms...

On Supergraphs Satisfying CMSO Properties

Let CMSO denote the counting monadic second order logic of graphs. We give a constructive proof that for some computable function f, there is an algorithm A that takes as input a CMSO sentence φ, a positive integer t, and a connected graph G of maximum degree at most ∆, and determines, in time f(|φ|, t) · 2^{O(∆·t)} · |G|^{O(t)}, whether G has a supergraph G′ of treewidth at most t such that G′ |= φ...

Nowhere-zero k-flows of Supergraphs

Let G be a 2-edge-connected graph with o vertices of odd degree. It is well-known that one should (and can) add o/2 edges to G in order to obtain a graph which admits a nowhere-zero 2-flow. We prove that one can add to G a set of ≤ ⌊o/4⌋, ⌈(1/2)⌊o/5⌋⌉, and ⌈(1/2)⌊o/7⌋⌉ edges such that the resulting graph admits a nowhere-zero 3-flow, 4-flow, and 5-flow, respectively.

Journal

Journal title: IEEE Journal on Selected Areas in Information Theory

Year: 2021

ISSN: 2641-8770

DOI: https://doi.org/10.1109/jsait.2021.3073257